Novel Definition and Algorithm for Chaining Fragments with Proportional Overlaps
Abstract
Chaining fragments is a crucial step in genome alignment. Existing chaining algorithms compute a maximum weighted chain in which adjacent fragments are not allowed to overlap. In practice, using local alignments as fragments instead of Maximal Exact Matches (MEMs) produces frequent overlaps between fragments, both for combinatorial reasons and because of biological factors such as variable tandem repeat structures whose copy number differs between genomic sequences. In this article, to lift this limitation, we formulate a novel definition of a chain that allows overlaps proportional to the fragment lengths, and present an efficient algorithm for computing such a maximum weighted chain. We tested our algorithm on a dataset of 694 genome pairs and observed significant improvements in coverage, while keeping running times within reasonable limits. Moreover, experiments with different ratios of allowed overlap showed that the chains are robust with respect to these ratios. Our algorithm is implemented in a tool called OverlapChainer (OC), which is available from the authors upon request.
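The abstract does not state the formal chain definition, so the following is only a minimal sketch of the general idea under stated assumptions, not the OverlapChainer algorithm. It assumes that a fragment may follow another if it starts and ends later on both sequences and if their overlap on each sequence is at most `ratio` times the length of the shorter fragment. It also assumes the chain weight is the plain sum of fragment weights, and it uses a simple quadratic dynamic program rather than the efficient algorithm the paper refers to. The names `Fragment`, `can_follow`, and `best_chain` are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Fragment(NamedTuple):
    s1: int        # start on sequence 1 (inclusive)
    e1: int        # end on sequence 1 (inclusive)
    s2: int        # start on sequence 2 (inclusive)
    e2: int        # end on sequence 2 (inclusive)
    weight: float  # e.g. an alignment score


def overlap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of positions shared by two inclusive intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def can_follow(prev: Fragment, nxt: Fragment, ratio: float) -> bool:
    """ASSUMED rule: nxt lies strictly after prev on both sequences, and the
    overlap on each sequence is at most ratio * (shorter fragment length)."""
    if not (prev.s1 < nxt.s1 and prev.e1 < nxt.e1
            and prev.s2 < nxt.s2 and prev.e2 < nxt.e2):
        return False
    for ps, pe, ns, ne in ((prev.s1, prev.e1, nxt.s1, nxt.e1),
                           (prev.s2, prev.e2, nxt.s2, nxt.e2)):
        shortest = min(pe - ps + 1, ne - ns + 1)
        if overlap(ps, pe, ns, ne) > ratio * shortest:
            return False
    return True


def best_chain(fragments: List[Fragment],
               ratio: float) -> Tuple[float, List[Fragment]]:
    """Quadratic DP: best[i] is the weight of the heaviest chain ending at i."""
    if not fragments:
        return 0.0, []
    # Any valid predecessor has a smaller s1, so it appears earlier in `order`.
    order = sorted(range(len(fragments)),
                   key=lambda i: (fragments[i].s1, fragments[i].e1))
    best = {}
    back = {}
    for pos, i in enumerate(order):
        best[i] = fragments[i].weight
        back[i] = -1
        for j in order[:pos]:
            candidate = best[j] + fragments[i].weight
            if candidate > best[i] and can_follow(fragments[j], fragments[i], ratio):
                best[i] = candidate
                back[i] = j
    # Trace back from the heaviest chain end.
    end = max(order, key=lambda i: best[i])
    chain = []
    i = end
    while i != -1:
        chain.append(fragments[i])
        i = back[i]
    chain.reverse()
    return best[end], chain
```

With `ratio = 0.0` the rule reduces to classical non-overlapping chaining, so the same function can be used to compare chains obtained with and without allowed overlaps on the same fragment set.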
Journal:
- Journal of computational biology : a journal of computational molecular cell biology
Volume 18, Issue 9
Pages: -
Publication date: 2010